Goto

Collaborating Authors

 crystal structure prediction


Crystal Structure Prediction by Joint Equivariant Diffusion

Neural Information Processing Systems

Crystal Structure Prediction (CSP) is crucial in various scientific disciplines. While CSP can be addressed by employing currently-prevailing generative models ( e.g.


OXtal: An All-Atom Diffusion Model for Organic Crystal Structure Prediction

Jin, Emily, Nica, Andrei Cristian, Galkin, Mikhail, Rector-Brooks, Jarrid, Lee, Kin Long Kelvin, Miret, Santiago, Arnold, Frances H., Bronstein, Michael, Bose, Avishek Joey, Tong, Alexander, Liu, Cheng-Hao

arXiv.org Artificial Intelligence

Accurately predicting experimentally-realizable 3D molecular crystal structures from their 2D chemical graphs is a long-standing open challenge in computational chemistry called crystal structure prediction (CSP). Efficiently solving this problem has implications ranging from pharmaceuticals to organic semiconductors, as crystal packing directly governs the physical and chemical properties of organic solids. In this paper, we introduce OXtal, a large-scale 100M parameter all-atom diffusion model that directly learns the conditional joint distribution over intramolecular conformations and periodic packing. To efficiently scale OXtal, we abandon explicit equivariant architectures imposing inductive bias arising from crystal symmetries in favor of data augmentation strategies. We further propose a novel crystallization-inspired lattice-free training scheme, Stoichiometric Stochastic Shell Sampling ($S^4$), that efficiently captures long-range interactions while sidestepping explicit lattice parametrization -- thus enabling more scalable architectural choices at all-atom resolution. By leveraging a large dataset of 600K experimentally validated crystal structures (including rigid and flexible molecules, co-crystals, and solvates), OXtal achieves orders-of-magnitude improvements over prior ab initio machine learning CSP methods, while remaining orders of magnitude cheaper than traditional quantum-chemical approaches. Specifically, OXtal recovers experimental structures with conformer $\text{RMSD}_1<0.5$ Å and attains over 80\% packing similarity rate, demonstrating its ability to model both thermodynamic and kinetic regularities of molecular crystallization.


Equivariant Diffusion for Crystal Structure Prediction

Lin, Peijia, Chen, Pin, Jiao, Rui, Mo, Qing, Cen, Jianhuan, Huang, Wenbing, Liu, Yang, Huang, Dan, Lu, Yutong

arXiv.org Artificial Intelligence

In addressing the challenge of Crystal Structure Prediction (CSP), symmetry-aware deep learning models, particularly diffusion models, have been extensively studied, which treat CSP as a conditional generation task. However, ensuring permutation, rotation, and periodic translation equivariance during diffusion process remains incompletely addressed. In this work, we propose EquiCSP, a novel equivariant diffusion-based generative model. We not only address the overlooked issue of lattice permutation equivariance in existing models, but also develop a unique noising algorithm that rigorously maintains periodic translation equivariance throughout both training and inference processes. Our experiments indicate that EquiCSP significantly surpasses existing models in terms of generating accurate structures and demonstrates faster convergence during the training process.


MXtalTools: A Toolkit for Machine Learning on Molecular Crystals

Kilgour, Michael, Tuckerman, Mark E., Rogal, Jutta

arXiv.org Artificial Intelligence

We present MXtalTools, a flexible Python package for the data-driven modelling of molecular crystals, facilitating machine learning studies of the molecular solid state. MXtalTools comprises several classes of utilities: (1) synthesis, collation, and curation of molecule and crystal datasets, (2) integrated workflows for model training and inference, (3) crystal parameterization and representation, (4) crystal structure sampling and optimization, (5) end-to-end differentiable crystal sampling, construction and analysis. Our modular functions can be integrated into existing workflows or combined and used to build novel modelling pipelines. MXtalTools leverages CUDA acceleration to enable high-throughput crystal modelling. The Python code is available open-source on our GitHub page, with detailed documentation on ReadTheDocs.


Challenges in Non-Polymeric Crystal Structure Prediction: Why a Geometric, Permutation-Invariant Loss is Needed

Jehanno, Emmanuel, Menegaux, Romain, Mairal, Julien, Grudinin, Sergei

arXiv.org Artificial Intelligence

Crystalline structure prediction is an essential prerequisite for designing materials with targeted properties. Yet, it is still an open challenge in materials design and drug discovery. Despite recent advances in computational materials science, accurately predicting three-dimensional non-polymeric crystal structures remains elusive. In this work, we focus on the molecular assembly problem, where a set $\mathcal{S}$ of identical rigid molecules is packed to form a crystalline structure. Such a simplified formulation provides a useful approximation to the actual problem. However, while recent state-of-the-art methods have increasingly adopted sophisticated techniques, the underlying learning objective remains ill-posed. We propose a better formulation that introduces a loss function capturing key geometric molecular properties while ensuring permutation invariance over $\mathcal{S}$. Remarkably, we demonstrate that within this framework, a simple regression model already outperforms prior approaches, including flow matching techniques, on the COD-Cluster17 benchmark, a curated non-polymeric subset of the Crystallography Open Database (COD).


FastCSP: Accelerated Molecular Crystal Structure Prediction with Universal Model for Atoms

Gharakhanyan, Vahe, Yang, Yi, Barroso-Luque, Luis, Shuaibi, Muhammed, Levine, Daniel S., Michel, Kyle, Bernat, Viachaslau, Dzamba, Misko, Fu, Xiang, Gao, Meng, Liu, Xingyu, Noori, Keian, Purvis, Lafe J., Rao, Tingling, Wood, Brandon M., Rizvi, Ammar, Uyttendaele, Matt, Ouderkirk, Andrew J., Daraio, Chiara, Zitnick, C. Lawrence, Boromand, Arman, Marom, Noa, Ulissi, Zachary W., Sriram, Anuroop

arXiv.org Artificial Intelligence

Crystal Structure Prediction (CSP) of molecular crystals plays a central role in applications, such as pharmaceuticals and organic electronics. CSP is challenging and computationally expensive due to the need to explore a large search space with sufficient accuracy to capture energy differences of a few kJ/mol between polymorphs. Dispersion-inclusive density functional theory (DFT) provides the required accuracy but its computational cost is impractical for a large number of putative structures. We introduce FastCSP, an open-source, high-throughput CSP workflow based on machine learning interatomic potentials (MLIPs). FastCSP combines random structure generation using Genarris 3.0 with geometry relaxation and free energy calculations powered entirely by the Universal Model for Atoms (UMA) MLIP. We benchmark FastCSP on a curated set of 28 mostly rigid molecules, demonstrating that our workflow consistently generates known experimental structures and ranks them within 5 kJ/mol per molecule of the global minimum. Our results demonstrate that universal MLIPs can be used across diverse compounds without requiring system-specific tuning. Moreover, the speed and accuracy afforded by UMA eliminate the need for classical force fields in the early stages of CSP and for final re-ranking with DFT. The open-source release of the entire FastCSP workflow significantly lowers the barrier to accessing CSP. CSP results for a single system can be obtained within hours on tens of modern GPUs, making high-throughput crystal structure prediction feasible for a broad range of scientific applications.


Polymorphism Crystal Structure Prediction with Adaptive Space Group Diversity Control

Omee, Sadman Sadeed, Wei, Lai, Dey, Sourin, Hu, Jianjun

arXiv.org Artificial Intelligence

Crystalline materials can form different structural arrangements (i.e. polymorphs) with the same chemical composition, exhibiting distinct physical properties depending on how they were synthesized or the conditions under which they operate. For example, carbon can exist as graphite (soft, conductive) or diamond (hard, insulating). Computational methods that can predict these polymorphs are vital in materials science, which help understand stability relationships, guide synthesis efforts, and discover new materials with desired properties without extensive trial-and-error experimentation. However, effective crystal structure prediction (CSP) algorithms for inorganic polymorph structures remain limited. We propose ParetoCSP2, a multi-objective genetic algorithm for polymorphism CSP that incorporates an adaptive space group diversity control technique, preventing over-representation of any single space group in the population guided by a neural network interatomic potential. Using an improved population initialization method and performing iterative structure relaxation, ParetoCSP2 not only alleviates premature convergence but also achieves improved convergence speed. Our results show that ParetoCSP2 achieves excellent performance in polymorphism prediction, including a nearly perfect space group and structural similarity accuracy for formulas with two polymorphs but with the same number of unit cell atoms. Evaluated on a benchmark dataset, it outperforms baseline algorithms by factors of 2.46-8.62 for these accuracies and improves by 44.8\%-87.04\% across key performance metrics for regular CSP. Our source code is freely available at https://github.com/usccolumbia/ParetoCSP2.


CRYSIM: Prediction of Symmetric Structures of Large Crystals with GPU-based Ising Machines

Liang, Chen, Das, Diptesh, Guo, Jiang, Tamura, Ryo, Mao, Zetian, Tsuda, Koji

arXiv.org Artificial Intelligence

Solving black-box optimization problems with Ising machines is increasingly common in materials science. However, their application to crystal structure prediction (CSP) is still ineffective due to symmetry agnostic encoding of atomic coordinates. We introduce CRYSIM, an algorithm that encodes the space group, the Wyckoff positions combination, and coordinates of independent atomic sites as separate variables. This encoding reduces the search space substantially by exploiting the symmetry in space groups. When CRYSIM is interfaced to Fixstars Amplify, a GPU-based Ising machine, its prediction performance was competitive with CALYPSO and Bayesian optimization for crystals containing more than 150 atoms in a unit cell. Although it is not realistic to interface CRYSIM to current small-scale quantum devices, it has the potential to become the standard CSP algorithm in the coming quantum age.


ContinuouSP: Generative Model for Crystal Structure Prediction with Invariance and Continuity

Tone, Yuji, Hanai, Masatoshi, Kawamura, Mitsuaki, Taura, Kenjiro, Suzumura, Toyotaro

arXiv.org Artificial Intelligence

The discovery of new materials using crystal structure prediction (CSP) based on generative machine learning models has become a significant research topic in recent years. In this paper, we study invariance and continuity in the generative machine learning for CSP. We propose a new model, called ContinuouSP, which effectively handles symmetry and periodicity in crystals. We clearly formulate the invariance and the continuity, and construct a model based on the energy-based model. Our preliminary evaluation demonstrates the effectiveness of this model with the CSP task.


CrySPAI: A new Crystal Structure Prediction Software Based on Artificial Intelligence

Wang, Zongguo, Chen, Ziyi, Yuan, Yang, Wang, Yangang

arXiv.org Artificial Intelligence

Crystal structure predictions based on the combination of first-principles calculations and machine learning have achieved significant success in materials science. However, most of these approaches are limited to predicting specific systems, which hinders their application to unknown or unexplored domains. In this paper, we present CrySPAI, a crystal structure prediction package developed using artificial intelligence (AI) to predict energetically stable crystal structures of inorganic materials given their chemical compositions. The software consists of three key modules, an evolutionary optimization algorithm (EOA) that searches for all possible crystal structure configurations, density functional theory (DFT) that provides the accurate energy values for these structures, and a deep neural network (DNN) that learns the relationship between crystal structures and their corresponding energies. To optimize the process across these modules, a distributed framework is implemented to parallelize tasks, and an automated workflow has been integrated into CrySPAI for seamless execution. This paper reports the development and implementation of AI AI-based CrySPAI Crystal Prediction Software tool and its unique features.